Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach

Authors

  • Riaz Ahmad
  • Saeeda Naz
  • Muhammad Zeshan Afzal
  • Sayed Hassan Amin
  • Thomas Breuel
  • Rongrong Ji

Abstract

Cursive scripts contain a large number of unique shapes, called ligatures, and these shapes vary in scale, orientation and location, which makes their recognition one of the most challenging pattern recognition problems. Recognizing this many ligatures is a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text, so these variations are missing from image databases and from experimental evaluations. This research uncovers the challenges faced by Arabic cursive script recognition in a holistic framework, taking Pashto as a test case because the Pashto language has a larger alphabet than Arabic, Persian and Urdu. A database of 8000 images of 1000 unique ligatures with scaling, orientation and location variations is introduced. A feature space based on the scale-invariant feature transform (SIFT), together with a segmentation framework, is proposed to overcome these challenges. The experimental results show significantly improved performance of the proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA).
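
The PCA baseline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_pca`, `project`, `nearest_neighbor`), the nearest-neighbor classifier, and the random binary images standing in for ligature images are all assumptions made for illustration. It shows PCA features computed from raw pixels and how a simple location shift changes them, which is the kind of variation the abstract says is usually left out of evaluations.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and top principal axes of X (n_samples, n_features)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project samples onto the principal axes."""
    return (X - mean) @ components.T

def nearest_neighbor(train_feats, train_labels, query_feats):
    """Label each query with the label of its closest training sample."""
    diff = query_feats[:, None, :] - train_feats[None, :, :]
    dist = (diff ** 2).sum(axis=2)  # squared Euclidean distances
    return train_labels[dist.argmin(axis=1)]

# Hypothetical stand-in data: 10 random 16x16 binary "ligature" images.
rng = np.random.default_rng(0)
images = (rng.random((10, 16, 16)) > 0.5).astype(float)
labels = np.arange(10)

X = images.reshape(len(images), -1)   # flatten each image to a pixel vector
mean, components = fit_pca(X, n_components=5)
features = project(X, mean, components)

# Sanity check: each training image is its own nearest neighbor.
predicted = nearest_neighbor(features, labels, features)

# Shift the first image 3 pixels horizontally (a location variation).
shifted = np.roll(images[0], 3, axis=1).reshape(1, -1)
shifted_features = project(shifted, mean, components)

print("feature shift caused by translation:",
      np.linalg.norm(shifted_features - features[0]))
```

Because the projection operates on raw pixel positions, moving the shape moves the feature vector as well. A local-descriptor approach such as SIFT is designed to be less sensitive to this kind of change.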


Similar resources

Feature Extraction Using Zernike Moments

Shape identification and feature extraction are the main concerns of any pattern recognition system. Object parameters are mostly dependent on spatio-temporal relationships among the pixels. However, feature extraction is a complex phenomenon which needs to be addressed from the invariance property, irrespective of position and orientation. Zernike moments are used as shape descriptors and identi...


Semi-Automated Transcription Generation for Pashto Cursive Script

Usually, a large amount of transcription data is required for training and benchmarking Optical Character Recognition (OCR) systems for new scripts like Pashto. In the case of real image data, the images are mostly acquired through scanning. For supervised training scenarios, a ground truth is required for the corresponding scanned images. Usually, the ground truth is created by tran...


Offline Handwritten MODI Character Recognition Using HU, Zernike Moments and Zoning

HOCR stands for Handwritten Optical Character Recognition. HOCR is the process of recognizing different handwritten characters from a digital image of a document. Automatic handwritten character recognition has attracted many researchers all over the world to contribute to the handwritten character recognition domain. Shape identification and feature extraction is a very important part of any cha...


Cursive Script Postal Address Recognition

By Prasun Sinha. Large variations in writing styles and difficulty in segmenting cursive words are the main reasons for cursive script postal address recognition being a challenging task. A scheme for locating and recognizing words based on over-segmentation followed by dynamic programming is proposed. This technique is being used for zip code extraction as ...


A New Approach to Segmentation of Persian Cursive Script based on Adjustment the Fragments

Optical Character Recognition (OCR) is a long-standing topic of great interest in the pattern recognition field. The recognition of cursive scripts such as Persian and Arabic is a difficult task because their segmentation suffers from serious problems in different languages. Segmentation is the process of dividing cursive words into smaller parts in order to decrease complexity and increase accuracy of re...



Journal title:

Volume 10, Issue 

Pages  -

Publication date: 2015